Abstract: Face recognition models are widely deployed in critical applications such as security authentication and intelligent surveillance, yet their heavy reliance on sensitive biometric features exposes them to significant security and copyright risks. Backdoor watermarking is widely used for copyright verification of face recognition models, but most existing methods embed the watermark with a dirty-label strategy, which breaks the semantic consistency of the data, makes the watermark easy to flag with current backdoor-detection mechanisms, and thus limits practical deployment. To address these issues, this paper proposes a clean-label backdoor watermarking method for face recognition models (CBW2F) that achieves high imperceptibility and strong robustness without modifying any sample labels. First, imperceptible adversarial perturbations are applied to a subset of samples to weaken the model's dependence on the original salient features and encourage it to learn and memorize the embedded backdoor trigger pattern. A structured and visually natural rainbow filter is then introduced as the trigger; through its cooperation with the perturbations, the watermark is embedded effectively while the original recognition performance is maintained. Experiments demonstrate that CBW2F effectively evades label-consistency-based backdoor detection, remains robust under various watermark removal attacks including model fine-tuning and model distillation, and outperforms state-of-the-art approaches across multiple evaluation metrics, providing a practical solution for copyright protection of face recognition models.
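The clean-label embedding described above (an imperceptible adversarial perturbation followed by a rainbow-filter trigger, with the sample's label left untouched) can be sketched roughly as follows. This is a minimal illustration, not the paper's exact procedure: the one-step FGSM-style perturbation, the externally supplied gradient, and the `eps`/`alpha` values are all assumptions made for the sketch.

```python
import numpy as np

def fgsm_perturb(x, grad, eps=8 / 255):
    """One-step FGSM-style perturbation (a stand-in for the paper's
    adversarial step; a real setup would use model gradients)."""
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

def rainbow_filter(x, alpha=0.2):
    """Blend a horizontal color gradient over the image as a structured,
    visually natural trigger (illustrative 'rainbow filter')."""
    h, w, _ = x.shape
    t = np.linspace(0.0, 1.0, w)
    # Simple left-to-right color sweep, shape (w, 3), broadcast over rows.
    overlay = np.stack([t, 1.0 - np.abs(2 * t - 1.0), 1.0 - t], axis=-1)
    overlay = np.broadcast_to(overlay, (h, w, 3))
    return np.clip((1.0 - alpha) * x + alpha * overlay, 0.0, 1.0)

def make_clean_label_poison(x, grad, eps=8 / 255, alpha=0.2):
    """Clean-label watermark sample: perturb first, then stamp the trigger.
    Crucially, the caller keeps the original (correct) label."""
    return rainbow_filter(fgsm_perturb(x, grad, eps), alpha)
```

Because the label stays correct, the poisoned samples pass label-consistency checks; the perturbation is what forces the model to rely on the trigger pattern rather than the face's own salient features.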